Unreliable Failure Detectors for Asynchronous Distributed Systems

Author

  • Tushar Deepak Chandra
Abstract

It is well known that several fundamental problems of fault-tolerant distributed computing, such as Consensus and Atomic Broadcast, cannot be solved in asynchronous systems with crash failures. These impossibility results stem from the lack of reliable failure detection in such systems. To circumvent such impossibility results, we introduce the concept of unreliable failure detectors that can make mistakes, and study the problem of using them to solve Consensus (and Atomic Broadcast). It is easy to solve Consensus using a "perfect" failure detector (one that does not make mistakes). But is perfect failure detection necessary to solve Consensus? We show that Consensus is solvable with unreliable failure detectors, even if they make an infinite number of mistakes. This leads to the following question: What is the "weakest" failure detector for solving Consensus? We introduce a notion of algorithmic reducibility that allows us to compare seemingly incomparable failure detectors. Using this concept, we show that one of the failure detectors that we introduce here is indeed the weakest failure detector for solving Consensus in asynchronous systems with a majority of correct processes. We also show that Consensus and Atomic Broadcast are equivalent in asynchronous systems. Thus all our results regarding the solvability of Consensus using failure detectors apply to Atomic Broadcast as well.

… spent his childhood in various cities in India: Bombay, Calcutta and finally Kanpur. After completing high school at the Doon School, he went on to do a Bachelor of …

This thesis is dedicated to my parents who taught me how to think.

Acknowledgements

A large number of people contributed either directly or indirectly to this thesis. I was extremely fortunate to have Sam Toueg as my advisor. Without his guidance, I would still be trying to shave off deltas from broadcast algorithms; without his cooking knowledge, I would never have learnt to cook pasta sauce "a la putanesca". Sam has put in as much work, if not more, towards this thesis as I have. I am indebted to Vassos Hadzilacos, my other co-author, for his invaluable contributions to this work. Without his help, Chapter 4 would not have happened. Further, his critical readings of the work presented in Chapter 3 have greatly influenced the way it is presented. The idea of using unreliable failure detectors came from the many invaluable discussions I had with the Isis folks, in particular Aleta Ricciardi and Ken Birman (also see [RB91]). Rod Downey taught …

Related resources

Efficient Algorithms to Implement Unreliable Failure Detectors in Partially Synchronous Systems

Unreliable failure detectors, proposed by Chandra and Toueg [2], are mechanisms that provide information about process failures. In [2], eight classes of failure detectors were defined, depending on how accurate this information is, and an algorithm implementing a failure detector of one of these classes in a partially synchronous system was presented. This algorithm is based on all-to-all comm...

About the Relationship between Election Problem and Failure Detector in Asynchronous Distributed Systems

This paper is about the relationship between the Election problem and failure detectors in asynchronous distributed systems. We first discuss the relationship between the Election problem and the Consensus problem in asynchronous distributed systems with unreliable failure detectors. Chandra and Toueg have stated that Consensus is solvable in asynchronous systems with unreliable failure detectors. B...

Revisiting the Relationship between Non-blocking Atomic Commitment and Consensus

This paper discusses the relationship between the Non-Blocking Atomic Commitment problem (NB-AC) and the Consensus problem in asynchronous systems with unreliable failure detectors. We first confirm that NB-AC is harder than Consensus. In contrast to Consensus, NB-AC is impossible to solve with unreliable failure detectors even with a single crash failure. We define a weaker problem than NB-AC, ca...

Lower Bounds with Unreliable Failure Detectors (Brief Announcement)

This paper is set in the context of fault-tolerant distributed computing. We investigate the efficiency of decision algorithms using unreliable failure detectors. We prove some lower bounds for the Consensus problem. In particular, we show that the longest message chain of all algorithms using a strong failure detector is greater than the number of processes, no matter the number of faulty proces...

Revisiting the Relationship Between Non-Blocking Atomic Commitment and Consensus

This paper discusses the relationship between the Non-Blocking Atomic Commitment problem (NB-AC) and the Consensus problem in asynchronous systems with unreliable failure detectors. We first confirm that NB-AC is harder than Consensus. In contrast to Consensus, NB-AC is impossible to solve with unreliable failure detectors even with a single crash failure. We define a weaker problem than NB-AC, ca...

Publication date: 1993